Bid and Ask Reconstruction
(draft version)


Piotr Lipiński
Computational Intelligence Research Group, Institute of Computer Science, University of Wroclaw, Poland
lipinski@cs.uni.wroc.pl

Abstract:

This notebook defines the assignment on reconstructing bid and ask data from transaction data.

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

%matplotlib inline

Input data

In a stock market with an order-driven trading mechanism, when a trader submits a buy (or sell) order, the order is matched against the orders already registered. If there is an opposite order, that is, someone has declared to sell (or buy) the required number of shares at the required price, the transaction is executed; otherwise the order is registered in the order book and waits for new incoming orders. The bid is the price of the best (highest-priced) buy order registered in the order book, and the ask is the price of the best (lowest-priced) sell order. Under this simplification, a transaction price should usually equal either the bid or the ask (in practice, some large orders are split and matched partially against several opposite orders, so the final price may differ from both).
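The matching rule described above can be illustrated with a toy order book. This is only a minimal sketch under strong assumptions: every order is for a single share, there are no partial fills, and the class name `ToyOrderBook` and its `submit` method are hypothetical helpers that are not part of the assignment data or of any library. The two prices are the first bid and ask values from the input data.

```python
import heapq


class ToyOrderBook:
    """Hypothetical single-share limit order book, for illustration only."""

    def __init__(self):
        self._buys = []   # max-heap of buy prices, stored negated
        self._sells = []  # min-heap of sell prices

    @property
    def bid(self):
        # Best (highest) registered buy price, or None if there is none.
        return -self._buys[0] if self._buys else None

    @property
    def ask(self):
        # Best (lowest) registered sell price, or None if there is none.
        return self._sells[0] if self._sells else None

    def submit(self, side, price):
        """Return the transaction price if the order is matched;
        otherwise register the order in the book and return None."""
        if side == 'buy':
            if self._sells and self._sells[0] <= price:
                return heapq.heappop(self._sells)   # executes at the ask
            heapq.heappush(self._buys, -price)
        elif side == 'sell':
            if self._buys and -self._buys[0] >= price:
                return -heapq.heappop(self._buys)   # executes at the bid
            heapq.heappush(self._sells, price)
        else:
            raise ValueError("side must be 'buy' or 'sell'")
        return None


book = ToyOrderBook()
book.submit('buy', 2994.5)    # no opposite order: registered, becomes the bid
book.submit('sell', 2995.5)   # no match at this price: registered, becomes the ask
bid_before, ask_before = book.bid, book.ask

trade = book.submit('buy', 2995.5)  # matches the registered sell order
```

In this sketch the incoming buy order trades at the ask, which is the reason a transaction price is expected to coincide with one side of the book.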

The input data contain the transaction data as well as the bid and ask data for a selected stock from the London Stock Exchange over a selected period.

In [2]:
df_bid_ask = pd.read_csv('bid_ask.csv', index_col='timestamp', parse_dates=['timestamp'])
display(df_bid_ask)
bid ask
timestamp
2013-09-02 08:30:00 2994.5 2995.5
2013-09-02 08:31:00 2997.0 2998.0
2013-09-02 08:32:00 2996.5 2998.0
2013-09-02 08:33:00 2996.0 2998.0
2013-09-02 08:34:00 2997.0 2998.5
... ... ...
2013-09-13 15:56:00 3118.5 3119.0
2013-09-13 15:57:00 3119.0 3119.5
2013-09-13 15:58:00 3122.0 3122.5
2013-09-13 15:59:00 3122.5 3123.5
2013-09-13 16:00:00 3123.5 3124.0

4510 rows × 2 columns

In [3]:
df_transaction = pd.read_csv('transaction.csv', index_col='timestamp', parse_dates=['timestamp'])
display(df_transaction)
transaction
timestamp
2013-09-02 08:32:00 2998.0
2013-09-02 08:33:00 2998.0
2013-09-02 08:34:00 2998.5
2013-09-02 08:35:00 2999.0
2013-09-02 08:38:00 3020.0
... ...
2013-09-13 15:54:00 3119.0
2013-09-13 15:55:00 3118.5
2013-09-13 15:56:00 3118.5
2013-09-13 15:57:00 3119.0
2013-09-13 16:00:00 3123.5

2221 rows × 1 columns

In [4]:
# One-hour window on the first trading day
window = slice('2013-09-02 09:00', '2013-09-02 10:00')

plt.figure(figsize=(12, 4))
df_bid_ask.loc[window, 'bid'].plot(label='bid')
df_bid_ask.loc[window, 'ask'].plot(label='ask')
df_transaction.loc[window, 'transaction'].plot(label='transaction', style='o')
plt.xlabel('Time')
plt.ylabel('Price')
plt.title('Bid and Ask')
plt.legend(loc='upper left')
plt.show()

Assignment

Please propose an approach to reconstruct bid and ask data from transaction data. Select one or more time periods for training the approach. In training, you may use the transaction data as well as the bid and ask data. Select one or more time periods for testing the approach. In testing, your approach should use only the transaction data.
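The assignment does not prescribe how to score a reconstruction on the test periods. One possible choice, shown here purely as an assumption, is the mean absolute error between the reconstructed and the original bid and ask on their shared timestamps. The function name `reconstruction_mae` and the fixed `half_spread` value are hypothetical; the three rows are taken from the data displayed above.

```python
import pandas as pd


def reconstruction_mae(df_true, df_pred):
    """Mean absolute error per column ('bid', 'ask') on the shared timestamps.

    Hypothetical helper: the metric is an assumption, not part of the assignment.
    """
    common = df_true.index.intersection(df_pred.index)
    cols = ['bid', 'ask']
    errors = (df_pred.loc[common, cols] - df_true.loc[common, cols]).abs()
    return errors.mean()


# Three rows copied from the bid/ask and transaction tables shown earlier
idx = pd.to_datetime(['2013-09-02 08:32:00',
                      '2013-09-02 08:33:00',
                      '2013-09-02 08:34:00'])
df_true = pd.DataFrame({'bid': [2996.5, 2996.0, 2997.0],
                        'ask': [2998.0, 2998.0, 2998.5]}, index=idx)
transaction = pd.Series([2998.0, 2998.0, 2998.5], index=idx)

# Placeholder reconstruction: a fixed, made-up half-spread around each transaction
half_spread = 0.5
df_pred = pd.DataFrame({'bid': transaction - half_spread,
                        'ask': transaction + half_spread})

mae = reconstruction_mae(df_true, df_pred)
print(mae)
```

Reporting the error separately for bid and ask makes it visible when an approach is systematically biased towards one side of the book.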

Example

In [5]:
# Timestamps of the transactions in the selected one-hour window
selection = df_transaction.loc['2013-09-02 09:00':'2013-09-02 10:00'].index
transaction = df_transaction.loc[selection, 'transaction']

# Naive reconstruction: shift each transaction price down (bid) and up (ask)
# by 10% of the standard deviation of the transaction prices in the window
offset = 0.1 * transaction.std()
df_bid_ask_reconstructed = pd.DataFrame({'bid': transaction - offset, 'ask': transaction + offset})

plt.figure(figsize=(12, 4))
df_bid_ask_reconstructed['bid'].plot(label='bid reconstructed')
df_bid_ask_reconstructed['ask'].plot(label='ask reconstructed')
# df_bid_ask['bid'].loc[selection].plot(label='bid original', color='#808080')
# df_bid_ask['ask'].loc[selection].plot(label='ask original', color='#808080')
df_transaction['transaction'].loc[selection].plot(label='transaction', style='o')
plt.xlabel('Time')
plt.ylabel('Price')
plt.title('Bid and Ask')
plt.legend(loc='upper left')
plt.show()
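
The example reconstructs bid and ask only at timestamps where a transaction occurred, while the bid and ask data have 4510 rows against 2221 transaction rows. Comparing against the original series on every minute therefore needs some alignment step. One simple option, given here as an assumption rather than a required method, is to carry the last known value forward onto a one-minute grid. The two transactions below are taken from the table above; the variable names are illustrative.

```python
import pandas as pd

# Two consecutive transactions from the input data, with a gap at 08:36 and 08:37
tx = pd.Series([2999.0, 3020.0],
               index=pd.to_datetime(['2013-09-02 08:35:00',
                                     '2013-09-02 08:38:00']),
               name='transaction')

# One-minute grid covering the same interval
grid = pd.date_range('2013-09-02 08:35:00', '2013-09-02 08:38:00', freq='min')

# Forward-fill: each missing minute takes the most recent earlier value
filled = tx.reindex(grid, method='ffill')
print(filled)
```

The same `reindex` call can be applied to a reconstructed bid and ask DataFrame. Forward-filling uses only past values, so it does not leak later transactions into earlier minutes.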